Atharv Pramod Jangam Data Visualization Final Project¶

Is climate change real?¶

Let's find out¶

In [10]:
#importing all the libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

First, let's look at the global land temperature anomaly from 1900 to 2022¶

In [11]:
global_Land = pd.read_csv("1900-2022 (global land).csv")
In [12]:
global_Land.head(10)
Out[12]:
Year Value
0 1900 -1.05
1 1901 -0.42
2 1902 0.17
3 1903 -0.19
4 1904 -0.88
5 1905 -0.30
6 1906 -0.25
7 1907 -0.66
8 1908 -0.32
9 1909 -0.95
In [13]:
global_Land = global_Land.rename(columns={'Value': 'Temperature'})
global_Land.head()
Out[13]:
Year Temperature
0 1900 -1.05
1 1901 -0.42
2 1902 0.17
3 1903 -0.19
4 1904 -0.88
In [15]:
!pip install plotly
In [16]:
import plotly.express as px

# Bar chart of the yearly anomaly, colored by its value
fig = px.bar(global_Land, x='Year', y='Temperature', color='Temperature',
             labels={'Temperature': 'Temperature (°C)'},
             title="Global Land Temperature Anomalies")
fig.show()

# save the interactive chart as HTML
fig.write_html("bar graph.html")

Data source = "https://www.ncei.noaa.gov/access/monitoring/climate-at-a-glance/global/time-series"

The graph above shows a clear change in the global land temperature anomaly between 1900 and 2022. The steepest rise begins around 1980, and the anomaly is still increasing.
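One way to put a number on this reading of the chart is to average the anomaly per decade. The sketch below is only an illustration: the helper name `decade_means` is hypothetical, and the sample values are made up rather than taken from the NOAA series.

```python
import pandas as pd

def decade_means(df, year_col="Year", value_col="Temperature"):
    """Average the anomaly over each decade (1900s, 1910s, ...)."""
    decade = (df[year_col] // 10) * 10
    return df.groupby(decade)[value_col].mean()

# Illustrative values only, NOT the NOAA data
sample = pd.DataFrame({
    "Year": [1900, 1901, 1980, 1981, 2020, 2021],
    "Temperature": [-1.0, -0.5, 0.25, 0.5, 1.5, 1.75],
})

by_decade = decade_means(sample)
print(by_decade)
```

Applied to the real data, the call would be `decade_means(global_Land)`, since that frame already has `Year` and `Temperature` columns.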

Now let's look at a specific country and what could be the reason behind the change.

Average temperature change in India¶

In [17]:
India_temp = pd.read_csv("month_seas_ann_min_temp_India_1901_2016 (1).csv")
In [18]:
India_temp.head()
Out[18]:
YEAR ANNUAL
0 1901 19.51
1 1902 19.44
2 1903 19.25
3 1904 19.22
4 1905 19.03
In [19]:
fig = px.line( India_temp, x='YEAR', y='ANNUAL',
             labels={ 'YEAR' : 'Year',
                      'ANNUAL' : 'Annual Temperature (°C)'
                 },
             title="Annual Average Temperature of India")
fig.update_traces(line_color='red')
fig.show()

Data source = "https://data.gov.in/catalog/all-india-seasonal-and-annual-minmax-temperature-series"

The average temperature of India increased by 2 °C from 1990 to 2016. This trend of rising annual temperature in India is in line with rising global surface temperatures. The major change took place around 1980, possibly because urbanization accelerated in the decades after India's independence in 1947.

Let's analyse even further¶

In [20]:
India_df = pd.read_csv("Weather Data in India from 1901 to 2017.csv", index_col=0)

India_df.head()
Out[20]:
YEAR JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC
0 1901 17.99 19.43 23.49 26.41 28.28 28.60 27.49 26.98 26.26 25.08 21.73 18.95
1 1902 19.00 20.39 24.10 26.54 28.68 28.44 27.29 27.05 25.95 24.37 21.33 18.78
2 1903 18.32 19.79 22.46 26.03 27.93 28.41 28.04 26.63 26.34 24.57 20.96 18.29
3 1904 17.77 19.39 22.95 26.73 27.83 27.85 26.84 26.73 25.84 24.36 21.07 18.84
4 1905 17.40 17.79 21.78 24.84 28.32 28.69 27.67 27.47 26.29 26.16 22.07 18.71
In [21]:
import plotly.graph_objects as go
from plotly.subplots import make_subplots

# Mean of the twelve monthly columns (JAN..DEC); column 0 is YEAR
India_df['Yearly Mean'] = India_df.iloc[:, 1:13].mean(axis=1)
fig = go.Figure(data=[
    go.Scatter(name='Yearly Temperatures', x=India_df['YEAR'], y=India_df['Yearly Mean'], mode='lines'),
    go.Scatter(name='Yearly Temperatures', x=India_df['YEAR'], y=India_df['Yearly Mean'], mode='markers')
])
fig.update_layout(title='Yearly Mean Temperature',
                 xaxis_title='Year', yaxis_title='Temperature (°C)')
fig.show()

# trendline='lowess' requires the statsmodels package
fig = px.scatter(India_df, x='YEAR', y='Yearly Mean', trendline='lowess')
fig.update_layout(title='Trendline Over The Years',
                 xaxis_title='Year', yaxis_title='Temperature (°C)')
fig.show()

Data source = "https://data.gov.in/catalog/all-india-seasonal-and-annual-minmax-temperature-series"

In these graphs we can clearly see a sharp rise in temperature after 1995 and again after 2015. The trendline uses locally weighted scatterplot smoothing (LOWESS), a common regression method that draws a smooth line through a time plot or scatter plot to help reveal relationships between variables and show trends.
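To make the idea behind LOWESS concrete, here is a simplified sketch in NumPy. It is not the implementation Plotly uses (that comes from statsmodels). The function name `lowess_sketch` is hypothetical, it does a single pass of tricube-weighted local linear fits without the robustness iterations of full LOWESS, and the sample series is made up.

```python
import numpy as np

def lowess_sketch(x, y, frac=0.5):
    """Simplified LOWESS: one tricube-weighted local linear fit per point."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    k = min(n, max(3, int(np.ceil(frac * n))))  # neighbours used in each local fit
    smoothed = np.empty(n)
    for i in range(n):
        d = x - x[i]                       # centre x on the current point
        dist = np.abs(d)
        h = np.sort(dist)[k - 1]           # bandwidth: distance to the k-th nearest point
        w = np.clip(1.0 - (dist / h) ** 3, 0.0, 1.0) ** 3  # tricube weights
        sw = np.sqrt(w)
        # Weighted least squares for the line a + b * (x - x_i)
        A = np.column_stack([np.ones(n), d]) * sw[:, None]
        beta, *_ = np.linalg.lstsq(A, y * sw, rcond=None)
        smoothed[i] = beta[0]              # fitted value at x_i is the intercept
    return smoothed

# Made-up series: a slow upward trend plus alternating noise
years = np.arange(1901, 1921)
noisy = 24.0 + 0.01 * (years - 1901) + 0.2 * (-1.0) ** np.arange(len(years))
trend = lowess_sketch(years, noisy, frac=0.5)
```

A larger `frac` uses more neighbours for each local fit and gives a smoother line; a smaller `frac` follows the data more closely.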

Visualizing the change in temperature by state¶

In [22]:
state_df = pd.read_csv("GlobalLandTemperaturesByState.csv")
In [23]:
state_df.head()
Out[23]:
dt AverageTemperature AverageTemperatureUncertainty State Country
0 1855-05-01 25.544 1.171 Acre Brazil
1 1855-06-01 24.228 1.103 Acre Brazil
2 1855-07-01 24.371 1.044 Acre Brazil
3 1855-08-01 25.427 1.073 Acre Brazil
4 1855-09-01 25.675 1.014 Acre Brazil
In [24]:
state_df.dtypes
Out[24]:
dt                                object
AverageTemperature               float64
AverageTemperatureUncertainty    float64
State                             object
Country                           object
dtype: object
In [25]:
state_df.shape
Out[25]:
(645675, 5)

Cleaning the data¶

In [26]:
state_df.isnull().sum()
Out[26]:
dt                                   0
AverageTemperature               25648
AverageTemperatureUncertainty    25648
State                                0
Country                              0
dtype: int64
In [27]:
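# preview the data with NA rows dropped (the dataset is large enough to afford it);
# dropna returns a new DataFrame, so state_df itself is not modified here
state_df.dropna(how ='any', axis =0)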
Out[27]:
dt AverageTemperature AverageTemperatureUncertainty State Country
0 1855-05-01 25.544 1.171 Acre Brazil
1 1855-06-01 24.228 1.103 Acre Brazil
2 1855-07-01 24.371 1.044 Acre Brazil
3 1855-08-01 25.427 1.073 Acre Brazil
4 1855-09-01 25.675 1.014 Acre Brazil
... ... ... ... ... ...
645669 2013-04-01 15.710 0.461 Zhejiang China
645670 2013-05-01 21.634 0.578 Zhejiang China
645671 2013-06-01 24.679 0.596 Zhejiang China
645672 2013-07-01 29.272 1.340 Zhejiang China
645673 2013-08-01 29.202 0.869 Zhejiang China

620027 rows × 5 columns
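Note that the cell above only displays a cleaned copy: `dropna` returns a new DataFrame and leaves `state_df` untouched unless the result is assigned. This is why missing values still show up further down. A minimal sketch with made-up rows (the name `demo` is hypothetical):

```python
import numpy as np
import pandas as pd

# Made-up rows standing in for state_df
demo = pd.DataFrame({
    "dt": ["1855-05-01", "1855-06-01", "1855-07-01"],
    "AverageTemperature": [25.5, np.nan, 24.4],
})

demo.dropna(how="any", axis=0)            # result is discarded; demo is unchanged
rows_after_bare_call = len(demo)

cleaned = demo.dropna(how="any", axis=0)  # assign the result to keep it
rows_after_assignment = len(cleaned)

print(rows_after_bare_call, rows_after_assignment)
```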

In [28]:
state_df.rename(columns={'dt':'Date'}, inplace = True)
state_df.drop(columns='AverageTemperatureUncertainty', inplace = True)
state_df.head()
Out[28]:
Date AverageTemperature State Country
0 1855-05-01 25.544 Acre Brazil
1 1855-06-01 24.228 Acre Brazil
2 1855-07-01 24.371 Acre Brazil
3 1855-08-01 25.427 Acre Brazil
4 1855-09-01 25.675 Acre Brazil
In [29]:
# convert the Date column from string to datetime
state_df['Date'] = pd.to_datetime(state_df['Date'])
In [30]:
state_df.dtypes
Out[30]:
Date                  datetime64[ns]
AverageTemperature           float64
State                         object
Country                       object
dtype: object

Plotting an interactive treemap to check which states had the highest average temperature¶

In [31]:
# Mean temperature per (Country, State), hottest first
country_state_temp = (state_df.groupby(['Country', 'State'])['AverageTemperature']
                      .mean().reset_index()
                      .sort_values('AverageTemperature', ascending=False)
                      .reset_index(drop=True))
country_state_temp["world"] = "world"
fig6 = px.treemap(country_state_temp.head(200), path=['world', 'Country','State'], values='AverageTemperature',
                  color='State',color_continuous_scale='RdBu')
fig6.show()

This interactive treemap shows the states with the highest average temperature and lets us drill down by country for a more detailed view.

In [32]:
# keep only the rows for India
df_filtered = state_df[state_df['Country'] == 'India']

# Print the new dataframe
print(df_filtered.head(15))
            Date  AverageTemperature                State Country
24709 1796-01-01              26.534  Andaman And Nicobar   India
24710 1796-02-01              26.294  Andaman And Nicobar   India
24711 1796-03-01              26.180  Andaman And Nicobar   India
24712 1796-04-01              27.942  Andaman And Nicobar   India
24713 1796-05-01              28.651  Andaman And Nicobar   India
24714 1796-06-01              28.307  Andaman And Nicobar   India
24715 1796-07-01              27.482  Andaman And Nicobar   India
24716 1796-08-01              28.025  Andaman And Nicobar   India
24717 1796-09-01              27.185  Andaman And Nicobar   India
24718 1796-10-01              26.808  Andaman And Nicobar   India
24719 1796-11-01              26.646  Andaman And Nicobar   India
24720 1796-12-01              24.796  Andaman And Nicobar   India
24721 1797-01-01                 NaN  Andaman And Nicobar   India
24722 1797-02-01                 NaN  Andaman And Nicobar   India
24723 1797-03-01              26.717  Andaman And Nicobar   India
In [33]:
df_filtered.AverageTemperature.isnull().sum()
Out[33]:
5044
In [34]:
# reassign instead of dropping in place on a slice, which triggers SettingWithCopyWarning
df_filtered = df_filtered.dropna(axis=0)
In [35]:
df_filtered.head(15)
Out[35]:
Date AverageTemperature State Country
24709 1796-01-01 26.534 Andaman And Nicobar India
24710 1796-02-01 26.294 Andaman And Nicobar India
24711 1796-03-01 26.180 Andaman And Nicobar India
24712 1796-04-01 27.942 Andaman And Nicobar India
24713 1796-05-01 28.651 Andaman And Nicobar India
24714 1796-06-01 28.307 Andaman And Nicobar India
24715 1796-07-01 27.482 Andaman And Nicobar India
24716 1796-08-01 28.025 Andaman And Nicobar India
24717 1796-09-01 27.185 Andaman And Nicobar India
24718 1796-10-01 26.808 Andaman And Nicobar India
24719 1796-11-01 26.646 Andaman And Nicobar India
24720 1796-12-01 24.796 Andaman And Nicobar India
24723 1797-03-01 26.717 Andaman And Nicobar India
24724 1797-04-01 27.111 Andaman And Nicobar India
24725 1797-05-01 27.856 Andaman And Nicobar India
In [36]:
#adding lat and long
lat = pd.read_csv("poptable.csv")
lat.head()
Out[36]:
State latitude longitude
0 Andaman And Nicobar 11.667026 92.735983
1 Andhra Pradesh 14.750429 78.570026
2 Arunachal Pradesh 27.100399 93.616601
3 Assam 26.749981 94.216667
4 Bihar 25.785414 87.479973
In [37]:
df_filtered['State'].unique()
Out[37]:
array(['Andaman And Nicobar', 'Andhra Pradesh', 'Arunachal Pradesh',
       'Assam', 'Bihar', 'Chandigarh', 'Chhattisgarh',
       'Dadra And Nagar Haveli', 'Daman And Diu', 'Delhi', 'Goa',
       'Gujarat', 'Haryana', 'Himachal Pradesh', 'Jammu And Kashmir',
       'Jharkhand', 'Karnataka', 'Kerala', 'Madhya Pradesh',
       'Maharashtra', 'Manipur', 'Meghalaya', 'Mizoram', 'Nagaland',
       'Orissa', 'Puducherry', 'Punjab', 'Rajasthan', 'Sikkim',
       'Tamil Nadu', 'Tripura', 'Uttar Pradesh', 'Uttaranchal',
       'West Bengal'], dtype=object)
In [38]:
lat['State'].unique()
Out[38]:
array(['Andaman And Nicobar', 'Andhra Pradesh', 'Arunachal Pradesh',
       'Assam', 'Bihar', 'Chandigarh', 'Chhattisgarh',
       'Dadra And Nagar Haveli', 'Daman And Diu', 'Delhi', 'Goa',
       'Gujarat', 'Haryana', 'Himachal Pradesh', 'Jammu And Kashmir',
       'Jharkhand', 'Karnataka', 'Kerala ', 'Madhya Pradesh',
       'Maharashtra', 'Manipur', 'Meghalaya', 'Mizoram', 'Nagaland',
       'Orissa', 'Puducherry', 'Punjab', 'Rajasthan', 'Sikkim',
       'Tamil Nadu', 'Tripura', 'Uttar Pradesh', 'Uttaranchal',
       'West Bengal'], dtype=object)
In [39]:
Final_df = pd.merge(df_filtered, lat)
In [40]:
Final_df.tail(15)
Out[40]:
Date AverageTemperature State Country latitude longitude
79121 2012-06-01 30.833 West Bengal India 22.58039 88.329947
79122 2012-07-01 28.667 West Bengal India 22.58039 88.329947
79123 2012-08-01 28.842 West Bengal India 22.58039 88.329947
79124 2012-09-01 28.496 West Bengal India 22.58039 88.329947
79125 2012-10-01 26.182 West Bengal India 22.58039 88.329947
79126 2012-11-01 22.359 West Bengal India 22.58039 88.329947
79127 2012-12-01 18.879 West Bengal India 22.58039 88.329947
79128 2013-01-01 16.974 West Bengal India 22.58039 88.329947
79129 2013-02-01 20.974 West Bengal India 22.58039 88.329947
79130 2013-03-01 26.121 West Bengal India 22.58039 88.329947
79131 2013-04-01 28.707 West Bengal India 22.58039 88.329947
79132 2013-05-01 29.694 West Bengal India 22.58039 88.329947
79133 2013-06-01 29.628 West Bengal India 22.58039 88.329947
79134 2013-07-01 29.115 West Bengal India 22.58039 88.329947
79135 2013-08-01 28.686 West Bengal India 22.58039 88.329947
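Comparing the two `unique()` listings above, the coordinates table appears to contain 'Kerala ' with a trailing space, while the temperature data has 'Kerala'. Because `pd.merge` defaults to an inner join on the shared column, such rows would be silently dropped from the merged frame. The sketch below uses made-up rows and the hypothetical names `temps` and `coords` to show one way to normalize the key and check for unmatched rows.

```python
import pandas as pd

# Made-up stand-ins for df_filtered and lat
temps = pd.DataFrame({
    "State": ["Kerala", "Goa"],
    "AverageTemperature": [27.0, 26.5],
})
coords = pd.DataFrame({
    "State": ["Kerala ", "Goa"],           # note the trailing space
    "latitude": [10.85, 15.30],
    "longitude": [76.27, 74.12],
})

naive = pd.merge(temps, coords)            # inner join: the 'Kerala' row is lost

coords["State"] = coords["State"].str.strip()   # normalize the join key
fixed = pd.merge(temps, coords, on="State", how="left", indicator=True)
unmatched = fixed[fixed["_merge"] == "left_only"]

print(len(naive), len(fixed), len(unmatched))
```

A left join with `indicator=True` keeps every temperature row and marks the ones that found no coordinates, so mismatched keys become visible instead of disappearing.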

I am saving this file to work on it in Tableau.

In [41]:
Final_df.to_csv('Final_df.csv')
In [52]:
from IPython.display import Video
Video("Temperature change.mp4", width=500)

To view the Tableau visualization, follow this link: https://public.tableau.com/views/TempChangeDataviz/Sheet1?:language=en-US&:display_count=n&:origin=viz_share_link

In 1950, the median average temperature was 25.77°C in Madhya Pradesh, 27.35°C in Rajasthan, and 15.44°C in Uttaranchal. By 2013, these medians had risen to 26.93°C, 29.64°C, and 18.75°C, respectively. Note that the 2013 records shown above end in August, so the 2013 medians leave out the cooler months at the end of the year and may overstate the change.
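The yearly medians quoted here come from Tableau. A rough pandas equivalent might look like the sketch below, which also skips years with fewer than twelve monthly records. The function name `yearly_median` and the sample values are hypothetical.

```python
import pandas as pd

def yearly_median(df, min_months=12):
    """Median of monthly temperatures per state and year, keeping only
    state-years with at least `min_months` observations."""
    out = (
        df.assign(Year=df["Date"].dt.year)
          .groupby(["State", "Year"])["AverageTemperature"]
          .agg(median="median", months="count")
          .reset_index()
    )
    return out[out["months"] >= min_months]

# Made-up series: a full 2012 and a 2013 that stops in August
dates = pd.date_range("2012-01-01", "2013-08-01", freq="MS")
sample = pd.DataFrame({
    "Date": dates,
    "State": "A",
    "AverageTemperature": dates.month.astype(float),  # placeholder values
})

complete = yearly_median(sample)               # only full years
partial = yearly_median(sample, min_months=1)  # includes the truncated 2013
print(complete)
print(partial)
```

With the real data this would be called as `yearly_median(Final_df)`, assuming `Date` is still a datetime column.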

A rise of 2°C may not sound like much, but it is more than enough to melt glaciers and cause heatwaves. Without concomitant increases in precipitation, heatwaves can lead to water shortages and increased stress for plants, particularly in arid regions.

Check this article by NASA - https://climate.nasa.gov/news/2865/a-degree-of-concern-why-global-temperatures-matter/

Let's see the reasons behind the increase in the temperature¶

We live in a greenhouse. Energy from the Sun is essential to life on Earth. About half of the light that reaches the Earth's atmosphere passes through the air and clouds to the surface, where it is absorbed and then radiated upward as infrared heat. Greenhouse gases absorb around 90% of this heat and radiate it back toward the surface.

We have data on greenhouse gas emissions. Let's find out whether we can relate the change in these gases to the change in temperature.

In [44]:
gas_df = pd.read_csv("ghg-emissions.csv", index_col=0)
In [45]:
gas_df.head()
Out[45]:
CH4 CO2 N2O
Year
1990 511.25 342.41 147.25
1991 516.81 386.06 151.40
1992 518.28 405.13 155.55
1993 522.27 430.19 159.09
1994 525.93 464.74 164.80
In [46]:
fig = px.area(gas_df,
              title="Greenhouse gas emissions in India, 1990-2019",
              labels = {
                  'value' : 'Greenhouse gas emission (measured in MtCO2e)'
              })
fig.show()

Data source = "https://www.climatewatchdata.org/ghg-emissions"

Over the past 30 years, we can see that emissions of CO2, methane (CH4), and nitrous oxide (N2O) have been rising. Driven by rising industrialization, globalization, and the combustion of fossil fuels, carbon dioxide makes up the bulk of India's greenhouse gas emissions. Studies have shown that methane and nitrous oxide trap far more heat per tonne than carbon dioxide, so they have a strong influence on climate change even where they are emitted in smaller quantities.
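The rise can also be expressed as a percent change per gas. The sketch below uses only the 1990 and 1994 rows shown in the `gas_df.head()` output above, so it illustrates the calculation rather than the full 1990-2019 picture. The helper names `percent_change` and `shares` are hypothetical.

```python
import pandas as pd

# First and last of the five rows displayed by gas_df.head() above (MtCO2e)
gas_sample = pd.DataFrame(
    {"CH4": [511.25, 525.93], "CO2": [342.41, 464.74], "N2O": [147.25, 164.80]},
    index=pd.Index([1990, 1994], name="Year"),
)

def percent_change(df):
    """Percent change of each column between the first and last row."""
    return (df.iloc[-1] / df.iloc[0] - 1) * 100

def shares(df):
    """Each gas as a percentage of the row total."""
    return df.div(df.sum(axis=1), axis=0) * 100

growth = percent_change(gas_sample)
share_by_year = shares(gas_sample)
print(growth.round(1))
print(share_by_year.round(1))
```

Running `percent_change(gas_df)` on the full table would give the change over the whole period.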

Effect of greenhouse gas: https://www.epa.gov/climate-indicators/greenhouse-gases#:~:text=An%20increase%20in%20the%20atmospheric,atmosphere%20increased%20by%2045%20percent.

That covers India. Finally, let us see how the GDP of countries is related to their carbon emissions.¶

In [47]:
#reading the data
gdpcm_df = pd.read_csv("GDP vs Carbon Dioxide emissions (2018)_Full Data_data.csv")
In [48]:
gdpcm_df.head()
Out[48]:
Continent Country Name CO2 emissions GDP Population
0 Asia Afghanistan 7.59 18400000000 37171922
1 Europe Albania 5.32 15100000000 2866376
2 Africa Algeria 151.87 175000000000 42228415
3 Africa Angola 62.93 101000000000 30809787
4 South America Argentina 207.11 518000000000 44494502
In [49]:
# Bubble chart on log-log axes: bubble size is population, color is continent
fig = px.scatter(gdpcm_df, x="GDP", y="CO2 emissions", color="Continent",
                 size='Population', hover_data=['Country Name'],
                 labels = {
                     'GDP': 'GDP (USD)',
                     'CO2 emissions': 'CO2 emissions (MtCO2e)'
                 },
                 log_x =True, log_y = True,
                 size_max=60)
fig.show()

Graph showing the relationship between GDP and carbon dioxide emissions for various nations in 2018, plotted on logarithmic axes. Carbon dioxide emissions are measured in MtCO2e (million tonnes of carbon dioxide equivalent), and GDP is measured in USD.

Data source: https://data.worldbank.org/indicator/NY.GDP.MKTP.CD

https://www.climatewatchdata.org/ghg-emissions?breakBy=countries&end_year=2018&gases=co2&sectors=transportation&start_year=1990

The wealthier nations, those with a greater Gross Domestic Product (GDP), contribute more to carbon dioxide emissions, as shown by the positive correlation between GDP and carbon dioxide emissions for 2018. The correlation between emissions and a nation's population is further evidence that greenhouse gas emissions are caused by human activity. The chart also shows how little the African countries contribute to global carbon dioxide emissions, so not all nations share the same responsibility for greenhouse gas emissions. The Asian nations with a developing industrial sector and a shift toward industrialization and globalization are more responsible for the increase in carbon dioxide emissions. Many European nations with smaller populations emit as much, or nearly as much, as nations with bigger populations and comparable GDPs. However, the lack of emphasis on per-capita emissions in climate agreements discourages developing nations from ratifying them.
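Both claims, the positive GDP-emissions correlation and the importance of per-capita figures, can be checked numerically. The sketch below uses only the five rows shown in the `gdpcm_df.head()` output above, and it assumes the emissions column is in million tonnes; the names `sample`, `log_corr`, and `ranked` are hypothetical.

```python
import numpy as np
import pandas as pd

# The five rows displayed by gdpcm_df.head() above
sample = pd.DataFrame({
    "Country Name": ["Afghanistan", "Albania", "Algeria", "Angola", "Argentina"],
    "CO2 emissions": [7.59, 5.32, 151.87, 62.93, 207.11],   # assumed MtCO2e
    "GDP": [18.4e9, 15.1e9, 175e9, 101e9, 518e9],           # USD
    "Population": [37171922, 2866376, 42228415, 30809787, 44494502],
})

# Pearson correlation of the log-transformed values, matching the log-log plot
log_corr = np.log10(sample["GDP"]).corr(np.log10(sample["CO2 emissions"]))

# Tonnes of CO2 per person, assuming emissions are in million tonnes
sample["CO2 per capita (t)"] = sample["CO2 emissions"] * 1e6 / sample["Population"]
ranked = sample.sort_values("CO2 per capita (t)", ascending=False)

print(round(log_corr, 2))
print(ranked[["Country Name", "CO2 per capita (t)"]])
```

On the full table the same two steps would be applied to `gdpcm_df` directly. Ranking by the per-capita column puts small, high-emitting countries ahead of large, low-emitting ones, which is the comparison the paragraph above argues climate agreements underweight.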

In [50]:
# Same scatter plot on linear axes, for comparison with the log-log version
fig = px.scatter(gdpcm_df, x="GDP", y="CO2 emissions", color="Continent",
                 size='Population', hover_data=['Country Name'],
                 labels = {
                     'GDP': 'GDP (USD)',
                     'CO2 emissions': 'CO2 emissions (MtCO2e)'
                 })
fig.show()